Zhenting Wang

MemGym: a Long-Horizon Memory Environment for LLM Agents

May 20, 2026
Via arXiv

UniVL: Unified Vision-Language Embedding for Spatially Grounded Contextual Image Generation

May 20, 2026
Via arXiv

Evidence Over Plans: Online Trajectory Verification for Skill Distillation

May 09, 2026
Via arXiv

EvoSkills: Self-Evolving Agent Skills via Co-Evolutionary Verification

Apr 02, 2026
Via arXiv

Memex(RL): Scaling Long-Horizon LLM Agents via Indexed Experience Memory

Mar 04, 2026
Via arXiv

TokenSeek: Memory Efficient Fine Tuning via Instance-Aware Token Ditching

Jan 27, 2026
Via arXiv

Reasoning over Precedents Alongside Statutes: Case-Augmented Deliberative Alignment for LLM Safety

Jan 12, 2026
Via arXiv

Read the Scene, Not the Script: Outcome-Aware Safety for LLMs

Oct 05, 2025
Via arXiv

EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning

Sep 26, 2025
Via arXiv

DUMP: Automated Distribution-Level Curriculum Learning for RL-based LLM Post-training

Apr 13, 2025
Via arXiv